Adding Linguistic Knowledge to a Lexical Example-Based Translation System
Abstract
Example-Based Machine Translation (EBMT) using partial exact matching against a database of translation examples has proven quite successful, but requires a large amount of pre-translated text in order to achieve broad coverage of unrestricted text. By adding linguistically tagged entries to the example base and permitting recursive matches that replace the matched text with the associated tag, substantial reductions in the required amount of pre-translated text can be achieved. A modest investment of time on the order of two person-weeks adding linguistic knowledge reduces the required example text by a factor of six or more, while retaining comparable translation quality. This reduction makes EBMT more attractive for so-called "low-density" languages for which little data is available.
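The tag-substitution idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the tag names, the English-French entries, and the functions `generalize` and `translate` are all hypothetical, matching is done on whole sentences rather than partial matches, and only one level of replacement is performed instead of full recursion.

```python
# Toy data: every entry below is invented for illustration only.
# Tagged entries map a linguistic tag to source words and their translations.
TAGGED = {
    "<number>": {"two": "deux", "three": "trois"},
    "<weekday>": {"monday": "lundi", "friday": "vendredi"},
}

# Example base whose source and target sides may contain tags.
EXAMPLES = {
    "i have <number> books": "j'ai <number> livres",
    "see you on <weekday>": "à <weekday>",
}


def generalize(sentence):
    """Replace words found in a tagged entry with the tag itself.

    Returns the generalized sentence and the (tag, translation) pairs
    needed to fill the tags back in on the target side.
    """
    tokens, fillers = [], []
    for word in sentence.lower().split():
        for tag, entries in TAGGED.items():
            if word in entries:
                tokens.append(tag)
                fillers.append((tag, entries[word]))
                break
        else:
            # No tagged entry matched, so keep the word as it is.
            tokens.append(word)
    return " ".join(tokens), fillers


def translate(sentence):
    """Translate via the example base, or return None if nothing matches."""
    pattern, fillers = generalize(sentence)
    target = EXAMPLES.get(pattern)
    if target is None:
        return None
    # Fill tags left to right; assumes source and target share tag order.
    for tag, translation in fillers:
        target = target.replace(tag, translation, 1)
    return target


print(translate("I have two books"))
print(translate("I have three books"))
```

In this sketch a single tagged example covers every number listed under `<number>`, which is the kind of saving in pre-translated text that the abstract describes.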
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Modeling a Quite Different Machine Translation using Lexical Conceptual Structure
The goal of this study is to outline the readability of an Example-Based Machine Translation for any pair of languages by means of the language-independent properties of the lexical conceptual structure (LCS). We describe LCS as a representation of traditional dependency relationships and use in experiments an isolated pair of verbs, extracted from Orwell's "1984" parallel English–Romanian ...
Multilingual Lexical Representation
The approach to multilingual lexical representation developed as part of the ACQUILEX Lexical Knowledge Base (LKB) is discussed with specific reference to complex translation equivalence. The treatment described provides a lexicalist account of translation mismatches in terms of translation links which capture cross-linguistic generalizations across sets of semantically related lexical items, and ...
Manipulation of Ideology in Translation of Political Texts: A Critical Discourse Analysis Perspective
As a culture-based phenomenon which involves both linguistic and social aspects, translation has been investigated from various perspectives. The present Critical Discourse Analysis (CDA)-based study is an attempt to probe into the manipulation of ideologies in translations of political texts. A CDA approach, based on Fairclough (1989), Van Dijk (2004) and Farahzad (2007), was adopted to conduc...
A System for Compound Noun Multiword Expression Extraction for Hindi
Compound noun multiword expressions are important for many NLP applications like machine translation and information retrieval. This paper describes a system for Hindi compound noun multiword expressions (MWE) extraction from a given corpus. We identify major categories of compound noun MWEs, based on linguistic and psycholinguistic principles. Our extraction methods use various statistical co-...
Publication date: 1999